This study extended the work on curriculum-based measurement to examine the criterion-related validity 
of curriculum-based measures in written expression for middle school students, the differences 
in validity coefficients for various lengths of text, and the sensitivity of curriculum-based measures 
to change in student performance. Curriculum-based measures were the number of correct word se- 
quences (CWS) and correct minus incorrect word sequences (CIWS) written in expository essays. Cri- 
terion measures were the number of functional elements in and quality ratings of student essays. Results 
revealed a strong relationship between curriculum-based and criterion measures. 


Curriculum-based measurement (CBM) is a system of mea- 
surement that can be used by teachers to monitor student 
progress over time and to evaluate the effects of instructional 
programs (Deno, 1985). Research on CBM at the elementary 
school level has demonstrated that simple and efficient mea- 
sures can be used as general indicators of student performance 
in an academic area (Deno, 1985). For example, in written 
expression, the number of words written, the number of words 
spelled correctly, and the number of correct word sequences 
(i.e., two adjacent correctly spelled words acceptable within 
the context to a native English speaker) written in 3 minutes 
all correlate at a moderate to moderately strong level with other 
measures of students’ writing performance. These measures 
include scores on standardized achievement tests, holistic eval- 
uations of writing, and teacher evaluations of writing ability 
(see Marston, 1989). Further, when CBM procedures are used 
by teachers to monitor student progress and evaluate the ef- 
fects of instructional programs, students achieve more (Fuchs, 
1998). 

Research on CBM has revealed that the measures used 
at the elementary level are not necessarily reliable and valid 
at the secondary level (see Espin & Tindal, 1998). For example, 
in the area of written expression, simple scoring metrics 
such as the number of words written and the number of 
words spelled correctly in a limited time frame (e.g., 3-6 minutes) 
have not been shown to be valid and reliable indicators 
of general writing proficiency for secondary students. Instead, 
somewhat more complex scoring systems involving the use of 
correct word sequences (CWS) seem to be required (Espin, 
Scierka, Skare, & Halverson, 1999; Espin et al., 2000; Fewster 
& MacMillan, 2002; Parker, Tindal, & Hasbrouck, 1991a, 
1991b; Tindal & Parker, 1989; Watkinson & Lee, 1992). 

CBM Writing Research at the Secondary Level 

Tindal, Parker, and colleagues conducted the initial research 
on the development of CBM measures in written expression 
for students at the secondary level (Parker et al., 1991a, 1991b; 
Tindal & Parker, 1989). Their research pointed to the use of 
either the number (Parker et al., 1991b; Tindal & Parker, 1989) 
or the percentage (Parker et al., 1991a) of correct word se- 
quences as valid indicators of student performance in written 
expression. The CWS scores were valid at both middle and 
high school levels, although correlation coefficients were some- 
what stronger at the middle school level than at the high 
school level (Parker et al., 1991a). Neither the number nor the 
percentage of CWS resulted in regular increases across the 
school year (Parker et al., 1991b). Percentage measures were 
seen to present unique problems with respect to growth mon- 
itoring because, by their nature, percentage measures are not 
sensitive to change in performance (Parker et al., 1991b; Tin- 
dal & Parker, 1989). If a student writes 10 word sequences at 
the beginning of the year with 5 correct, the percentage score 
is 50%. If that same student writes 50 word sequences at the 
end of the year with 25 correct, the percentage score remains 
50%. No change in performance is reflected in the score. 
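
The arithmetic behind this example can be sketched in a few lines. The snippet below is only an illustration of the point made in the text, using the same numbers; the function and variable names are hypothetical and not part of any published scoring procedure.

```python
def percent_correct(correct: int, total: int) -> float:
    """Percentage of written word sequences that were judged correct."""
    if total <= 0:
        raise ValueError("total must be a positive number of word sequences")
    return 100.0 * correct / total


# (correct sequences, total sequences), taken from the example in the text
beginning_of_year = (5, 10)
end_of_year = (25, 50)

beginning_pct = percent_correct(*beginning_of_year)
end_pct = percent_correct(*end_of_year)

# The percentage score is identical at both time points ...
print(beginning_pct, end_pct)  # 50.0 50.0

# ... whereas the number of correct sequences shows the gain.
count_gain = end_of_year[0] - beginning_of_year[0]
print(count_gain)  # 20
```

The count-based score changes by 20 correct sequences while the percentage score does not move, which is the sense in which percentage measures can mask growth.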

Because of the inherent problems associated with use of 
percentage scores for growth monitoring, subsequent research 
focused on the number rather than the percentage of correct 
word sequences. Espin and colleagues (Espin et al., 1999; Es- 
pin et al., 2000) confirmed the validity and reliability of the 
number of CWS as an indicator of general writing perfor- 
mance and introduced a new scoring procedure, the number 
of correct minus incorrect word sequences (CIWS; Espin et al., 
2000). In the Espin studies, both CWS and CIWS were found 
to correlate at moderate to moderately strong levels with holis- 
tic ratings of writing performance. Similar to the findings of 
Parker et al. (1991a), correlation coefficients were somewhat 
stronger at the middle school level (Espin et al., 2000) than at 
the high school level (Espin et al., 1999). At the middle school 
level, both CWS and CIWS were found to have acceptable 
alternate-form reliabilities, and validity and reliability were 
found not to differ as a function of type of text (story writing 
vs. descriptive writing) or writing time (3 min vs. 5 min; Espin 
et al., 2000). Effects of type of text and writing time were not 
examined at the high school level. 
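
The two scoring metrics differ only in how incorrect sequences are treated. As a minimal sketch, assume a human scorer has already judged each adjacent word pair as correct or incorrect; representing those judgments as a list of booleans is an assumption made here for illustration, and the code performs only the final tally, not the linguistic judgment itself.

```python
from typing import Sequence


def count_cws(judgments: Sequence[bool]) -> int:
    """Number of word sequences the scorer marked as correct (CWS)."""
    return sum(1 for is_correct in judgments if is_correct)


def count_ciws(judgments: Sequence[bool]) -> int:
    """Correct minus incorrect word sequences (CIWS)."""
    correct = count_cws(judgments)
    incorrect = len(judgments) - correct
    return correct - incorrect


# Hypothetical scorer judgments for eight adjacent word pairs
sample_judgments = [True, True, False, True, True, True, False, True]

print(count_cws(sample_judgments))   # 6
print(count_ciws(sample_judgments))  # 4
```

Because CIWS subtracts incorrect sequences, two writers with the same CWS can receive different CIWS scores if one of them makes more errors.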

Purpose of the Study 

Although the research that has been conducted thus far on CBM 
written expression at the secondary level has been consistent 
in its findings regarding the potential use of CWS scoring metrics, 
several issues have yet to be addressed. For example, in 
previous research, only two forms of writing have been used 
(story writing and descriptive writing) and the length of the 
writing sample has been limited to 6 min or less. The validity 
of CWS and CIWS for indexing student writing performance 
using other forms of writing and the validity with respect to 
varying text lengths are not known. 

In addition, in previous research, the criterion variable 
has been primarily holistic ratings of students’ writing. These 
holistic ratings have been conducted on samples that have not 
been corrected for basic writing elements, such as spelling, 
punctuation, and capitalization. It is possible that the correla- 
tions found between the CBM scoring metrics and holistic 
ratings have been a function of the influence of basic writing 
elements on both of these measures. The validity of CWS and 
CIWS with respect to higher level elements of writing, such as 
content, coherence, and completeness, has not been examined. 
Finally, research has not yet examined the validity of CWS 
and CIWS for indexing change in performance over time. 

In the current study, we addressed several previously unexplored 
issues. First, we examined the reliability and validity 
of CBM scoring metrics using a different genre of writing: 
expository essays. Second, we used as criterion variables the 
number of functional elements (units in the essay supporting 
the development of the writer’s paper) and quality ratings of the 
essays after the essays had been corrected for spelling, punc- 
tuation, and capitalization. These criterion measures reflect 
the content, coherence, and completeness of the essay. Finally, 
we examined the sensitivity of CWS and CIWS for detecting 
improvements in writing over time. Four research questions 
were addressed in this study: 

1. What is the relationship between the number of 
CWS and CIWS written in expository essays 
and the number of functional elements in- 
cluded in those essays? 

2. What is the relationship between the number of 
CWS and CIWS written in expository essays 
and quality ratings of those essays? 

3. Do the relationships between the number of 
CWS and CIWS written in expository essays 
and the criterion variables differ with respect to 
the length of the text? 

4. Are CWS and CIWS sensitive to changes in 
students’ writing performance over time? 

Based on previous research at the middle school level, 
we hypothesized that the relationship between CBM and cri- 
terion measures would be moderate to moderately strong. We 
made no hypotheses regarding the influence of text length or 
the sensitivity of the measures to change over time because, 
to date, little research has been conducted on these issues at 
the secondary level. 

Data Set 

One difficulty associated with examining the validity of CBM 
writing measures for indexing growth is that, because instruc- 
tion in composition is not always a part of the regular curricu- 
lum at the secondary school level (e.g., see Greenwald, Persky, 
Campbell, & Mazzeo, 1999), it is never clear whether a lack 
of improvement on the CBM measures is due to technically 
inadequate measures or to a lack of improvement in students’ 
writing proficiency. Our study consists of a reanalysis of data 
collected as a part of an earlier study designed to investigate 
the effects of strategy instruction on the writing performance 
of middle school students with and without learning disabilities 
(LD; see De La Paz, 1999). Instruction was designed to 
teach students to plan a composition in advance of composing 
and to continue planning throughout the composing process. 
Previous research has found that students with and without 
LD do little advance planning (Scardamalia & Bereiter, 1986); 
yet, when taught to do so, these students produce substantially 
better papers (Danoff, Harris, & Graham, 1993; De La Paz & 
Graham, 1997a, 1997b, 2002; Graham, MacArthur, Schwartz, 
& Page-Voth, 1992; Harris & Graham, 1985; MacArthur, 
Schwartz, Graham, Molloy, & Harris, 1996; Page-Voth & 
Graham, 1999). 

The advantage of using an existing data set was that we 
could be assured that an intensive intervention had been de- 
livered to the students and that the students’ writing perfor- 
mance had improved. Results of the original single-subject 
design study revealed that students increased the length, qual- 
ity, and completeness of their essays following implementa- 
tion of strategy instruction. 

Method 

Participants and Setting 

Participants in the study were 22 students (11 boys and 11 
girls) in the seventh and eighth grades. Students were selected 
from five different classrooms of three language arts teachers 
in two suburban middle schools in the southeastern part of 
the United States. The schools had populations of 504 and 540 
students. Students in both schools were primarily Caucasian 
(approximately 94%), with a small number of African American, 
Asian American, and Hispanic students. Eighteen percent 
of students in the first school and 12% of students in the 
second school received free and reduced-cost lunches. Less 
than 1% of the students received services in English as a sec- 
ond language. In the current study, 91% of the participants were 
Caucasian and 9% were African American. 

Participants included students diagnosed with LD (n = 
6), and low- (n = 6), average- (n = 6), and high-achieving writers 
(n = 4). Students without LD were classified into low-, 
average-, and high-achieving groups, based on their scores 
on the written expression subtest of the Wechsler Individual 
Achievement Test (WIAT; 1992). Low-achieving writers (LA) 
were those with standard scores of 79 to 91, average-achieving 
writers (AA) were those with scores of 96 to 105, and high-achieving 
writers (HA) were those with scores of 116 to 123. 

Students in the LD group had been identified as LD 
according to district criteria. Students had verbal IQ scores 
between 85 and 125 on individually administered norm- 
referenced intelligence tests, scored at least 1 SD below average 
in reading, writing, and/or mathematics on a norm-referenced 
achievement test, had no other handicapping conditions, and 
used English as their primary language. Students with LD who 
participated in the study had been nominated by their teachers 
as having difficulty with writing composition. The average 
WIAT standard score for the LD students was 81. 

Students’ percentile rank scores on the Language Arts sub- 
test of the Comprehensive Tests of Basic Skills (1989), a group- 
administered achievement battery, were as follows: LD, 28; 
LA, 41; AA, 70; and HA, 73. 

Procedures 

Students wrote expository essays. Expository essays were cho- 
sen because seventh- and eighth-grade students were required 
to write expository essays to pass the state’s competency test. 
A bank of topics was developed based on previous state exams. 
This bank was then shown to one special education and two 
general education middle school teachers, who eliminated or 
modified topics, based on interest and difficulty levels for mid- 
dle school students. The following are some examples of top- 
ics: “Choose a country you would like to visit. Write an essay 
explaining why you want to go to this country,” “Think about 
how students can improve their grades. Write an essay telling 
why it is important to get good grades, and explain how students 
can improve their grades,” and “Think about rules you think 
are not fair. In an essay, state what rules you think should be 
changed, and give reasons explaining why you think so.” 

Essays were administered and monitored by the class- 
room teacher. Teachers provided students with a copy of the 
topic, read the topic aloud, and then read the following direc- 
tions: 

Look carefully at the prompt and make up a good 
essay to go with it. Remember to plan your essay 
before you begin writing. Try to remember every- 
thing you know about writing essays. Also, it is 
okay to change your plan or go back to add ideas 
to your plan when you are composing your essay. 

Do you understand these instructions? After we 
begin writing, I cannot help you with anything. 

Students were given 35 minutes to write their essays by hand. 
No assistance was given to students for spelling or grammar. 

Students wrote a minimum of six expository essays at 
the beginning of the study. Following collection of the pretest 
data, students were instructed in writing using composition 
strategies designed to help them plan, organize, and write 
expository essays. Instruction was 4 weeks long, averaging 
4 days per week. Within 1 week following instruction, students 
were again asked to write expository essays. Results of the 
multiple-baseline study revealed that the students improved in 
their writing performance. Students in all four groups (LD, LA, 
AA, and HA) wrote longer, more complete, and higher qual- 
ity essays (see De La Paz, 1999, for details). 

To address the research questions for the current study, 
a random sample of three pretests and three posttests was selected 
from each student to be scored for CWS and CIWS. CWS 
was defined as any two adjacent, correctly spelled words that 
were acceptable within the context of the sample to a native 
English speaker. Acceptable meant that a native speaker would 
judge the word sequence as syntactically and semantically 
correct (Videen, Deno, & Marston, 1982). End punctuation 
and beginning capitalization were also taken into account in 
scoring CWS and CIWS (Tindal & Parker, 1989). Pretests and 
posttests were scored by the first author and two graduate stu- 
dents. Rules for scoring CWS and CIWS were reviewed and 
then coders scored several essays together, discussing issues 
as they arose. Coders then independently scored approxi- 
mately 10% each of the pretests and posttests. Percentage of 
scoring agreement between pairs of coders was calculated by 
dividing agreements by the total number of agreements plus 
disagreements and multiplying by 100. Rates of agreement 
between pairs of coders were 96.62%, 97.49%, and 97.06% 
for CWS and 90.02%, 90.32%, and 91.23% for CIWS. 
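
The agreement calculation described above (agreements divided by agreements plus disagreements, multiplied by 100) can be written out as follows. The counts in the example are hypothetical and were chosen only to show the computation; the study reports the resulting percentages, not the underlying counts.

```python
def percent_agreement(agreements: int, disagreements: int) -> float:
    """Scoring agreement between two coders, expressed as a percentage."""
    total = agreements + disagreements
    if total == 0:
        raise ValueError("at least one scored item is required")
    return 100.0 * agreements / total


# Hypothetical counts: two coders agree on 193 of 200 word sequences
example_agreement = percent_agreement(agreements=193, disagreements=7)
print(round(example_agreement, 2))  # 96.5
```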

Criterion variables in the study were the number of functional 
essay elements and quality ratings of the essays. Functional 
elements were defined as units in the essay that directly 
supported the development of the writer’s paper (Graham, 
1990). Functional elements included premises (statements specifying 
the writer’s position on the topic), reasons (explanations 
to support or refute a position), elaborations (extensions or examples 
of a premise, reason, or conclusion), and conclusions 
(closing statements). Nonfunctional elements were units that 
were repeated without any discernible rhetorical purpose, were 
unrelated to the topic, or were not appropriate for an exposi- 
tory genre. Essays were divided into minimally parsable units 
(i.e., the smallest units of an argument that convey meaning; 
Scardamalia, Bereiter, & Goelman, 1982) and were scored as 
functional or nonfunctional. The number of elements in each 
essay ranged from 5 to 40. The second author scored all essays 
for functional elements. Twenty-five percent of the essays were 
scored by an independent rater. Interrater reliability for the 
total number of functional essay elements, determined in the 
same manner as percentage of scoring agreement, was 84%. 

The quality of the essays was assessed by trained raters 
using a holistic rating system. Raters were unfamiliar with the 
purpose or design of the study. Prior to scoring, essays were 
typed and corrected for spelling, punctuation, and capitaliza- 
tion, to remove the effects of these factors on the ratings as- 
signed to each essay (see Graham, 1992). These factors would 
especially penalize students with LD, who make considerably 
more mechanical errors than their normally achieving peers 
(Deno, Marston, & Mirkin, 1982). 

Essays were scored by two general education teachers 
(one seventh- and one eighth-grade teacher from another suburban 
middle school) who were unfamiliar with the design of the 
study. Raters scored the essays on the basis of their general 
impression of overall quality. Essays were rated on a scale of 
0 to 7, with 0 = low and 7 = outstanding. Raters were instructed 
to consider the ideas and development of the essay; the or- 
ganization, unity, and coherence; and the breadth of the vo- 
cabulary in assigning a score to the essay. Anchor points were 
established by selecting a high essay (score of 7), a middle 
essay (score of 4), and a low essay (score of 1). These essays 
were obtained from seventh- and eighth-grade students who 
were in the target schools but were not participating in the 
study. Interrater agreement between the two raters, as calcu- 
lated by Pearson product-moment correlations, was .90. Dif- 
ferences between raters were resolved through discussion. 
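
For readers who wish to reproduce this kind of interrater index, a Pearson product-moment correlation can be computed directly from two raters' scores. The sketch below uses made-up ratings on the 0 to 7 scale; the data and the function name are illustrative assumptions and do not come from the study.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2 scores)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one rater gives constant scores")
    return sxy / sqrt(sxx * syy)


# Hypothetical holistic ratings (0-7 scale) from two raters on six essays
rater_a = [1, 3, 4, 5, 6, 7]
rater_b = [2, 3, 4, 4, 6, 7]

interrater_r = pearson_r(rater_a, rater_b)
print(f"{interrater_r:.2f}")
```

A correlation reflects how consistently the two raters rank the essays rather than whether they assign identical scores, which may be one reason remaining differences were resolved through discussion.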

In addition to the functional elements and quality ratings, 
the number of words written in the essays was scored via com- 
puter. Any word that represented a spoken word, regardless of 
its spelling, was counted as a written word. 

Results 

Relationship Between CBM Scores and Criterion Measures 

To address the first two research questions, correlations be- 
tween the CBM scoring metrics (CWS and CIWS) and the cri- 
terion measures (functional elements and quality ratings) were 
examined. Pretest and posttest scores for students on the CBM 
and criterion measures are presented in Table 1. Correlations 
between the measures are presented in Table 2. Correlations 
for pretests and posttests were calculated separately. The magnitude 
of the correlations between the predictor and criterion 
variables was strong, ranging from .66 to .83. In general, the 
obtained correlations for CWS and CIWS were similar in magnitude. 
Correlations between the CBM scoring metrics and the 
quality posttest rating were lower than for the quality pretest 
ratings, most likely due to a bunching of posttest quality scores: 
While the overall range of scores for the posttest was greater 
than for the pretest (thus the larger standard deviation for the 
posttest), there was a greater bunching of scores on the posttest, 
with 17 of 22 students receiving average quality ratings 
between 5 and 7. 

TABLE 1. Means and Standard Deviations on Pretest 
and Posttest for CBM and Criterion Measures 

                               Pretest             Posttest
Measure                       M       SD          M       SD
CWS                         97.74   47.96      183.93   45.33
CIWS                        80.33   47.93      151.39   56.65
Number of words written    106.83   46.11      203.77   36.96
Functional elements         12.18    3.06       27.08    4.77
Quality ratings              2.58     .63        5.26     .86

Note. CBM = curriculum-based measurement; CWS = correct word sequences; 
CIWS = correct minus incorrect word sequences. 

TABLE 2. Correlations Between CBM and Criterion 
Measures 

                            Functional elements      Quality ratings
Measure                     Pretest    Posttest     Pretest    Posttest
CWS                           .70        .79          .83        .68
CIWS                          .70        .66          .82        .67
Number of words written       .68        .90          .82        .58

Note. CBM = curriculum-based measurement; CWS = correct word sequences; CIWS = 
correct minus incorrect word sequences. All correlations significant at p < .01. 

Moderately strong to strong correlations between the 
number of words written in the essay and the dependent vari- 
ables were also found (r = .58-.90; see also Table 2). Although 
previous research on writing (see Hillocks, 1986, for a review) 
has revealed a relationship between length and essay quality, 
our finding is unusual in light of previous CBM research at 
the secondary level. In that research, the relationship between 
the number of words written and other measures of written 
expression proficiency, including quality of writing, has been 
in the low to moderate range (r = .0-.47; Espin et al., 1999; 
Espin et al., 2000; Parker et al., 1991b; Tindal & Parker, 1989). 

With our third research question, we further explored the 
issue of length of text. To address this question, we calculated 
the correlations between CWS and CIWS in the first 50 words 
of the writing sample and the criterion variables. This analysis 
addressed the issue of whether length alone was responsible 
for the correlations between the CBM and the criterion mea- 
sures, or whether the number of correct and incorrect word 
sequences also were important factors. 

Examination of scattergrams between the CBM and cri- 
terion variables revealed an outlier in the correlations for the 
pretest scores (see Note 1). This outlier was removed for sub- 
sequent analyses. Means and standard deviations for the num- 
ber of CWS and CIWS on the pretest and posttest for the 
50-word sample are reported in Table 3. Correlations between 
predictor and criterion variables are reported in Table 4. 

As might be expected, the magnitude of the correlations 
is lessened when the number of words written is limited to 
50 words because limiting the length also limits the range of 
CWS and CIWS scores. Nevertheless, the obtained correla- 
tions are still quite respectable, ranging from .33 to .59, and are 
in line with the results of McCulley (1985), who found corre- 
lations of .41 between measures of text cohesion and quality 
when length was held constant. These results indicate that the 
relationship between CWS and CIWS and the criterion measures is not solely a result of 
the influence of length of text: Other factors contribute to the 
correlations. 

Sensitivity to Change Over Time 

Our final research question addressed the sensitivity of the 
CBM scoring metrics to change in performance over time. To 
address this question, we examined the differences from 
pretest to posttest for both the CBM scores and the criterion 
measures. We expected significant pre-post differences in the 
criterion measures, given the improvements demonstrated in 
the single-subject design study conducted by De La Paz (1999). 
Our primary interest, however, was whether the CBM scoring 
metrics would also be sensitive to change in performance over 
time. Due to the unexpected results related to the number of 
words written in text, we present words written in this set of 
analyses as a potential CBM scoring metric. 

Given the number of dependent variables we 
wished to analyze, we first ran a MANOVA, with time (pretest 
to posttest) as a within-subjects factor. Dependent variables 
entered into the analysis included functional essay elements, 
quality ratings, CWS, CIWS, and number of words written. 
Results of the MANOVA revealed significant effects, Λ = .13, 
F(5, 38) = 52.57, p < .001. Follow-up univariate F tests revealed 
significant changes from pre- to posttests on the number 
of functional essay elements, F(1, 42) = 151.83, p < .001, 
η² = .78, and quality ratings, F(1, 42) = 139.59, p < .001, η² = 
.77, confirming the results found in the De La Paz (1999) 
single-subject design study. Of interest to us was the fact that 
significant differences between pre- and posttest also were 
found for CWS, F(1, 42) = 37.52, p < .001, η² = .47; CIWS, 
F(1, 42) = 20.18, p < .001, η² = .32; and number of words written, 
F(1, 42) = 59.21, p < .001, η² = .59. These differences 
indicate that the CBM scoring metrics also were sensitive to 
change over time. Of note, the eta-squared value for number 
of words written was larger than that for CWS and CIWS. 

TABLE 3. Means and Standard Deviations for CBM 
Measures on First 50 Words of Pretest and Posttest 

               Pretest^a            Posttest
Measure       M       SD          M       SD
CWS         44.67    5.39       46.86    4.24
CIWS        36.10   10.17       39.55    8.62

Note. CBM = curriculum-based measurement; CWS = correct word sequences; 
CIWS = correct minus incorrect word sequences. 
^a Means and standard deviations of pretest before outlier was removed were as 
follows: M = 43.21, SD = 8.64, for CWS; M = 33.61, SD = 15.30, for CIWS. 

TABLE 4. Correlations Between CBM Scores for First 
50 Words and Criterion Measures 

             Functional elements        Quality ratings
Measure      Pretest^a   Posttest     Pretest^a   Posttest
CWS            .43*        .35          .59**       .56**
CIWS           .44*        .33          .58**       .54*

Note. CBM = curriculum-based measurement; CWS = correct word sequences; 
CIWS = correct minus incorrect word sequences. 
^a Correlations between the number of functional elements and CWS and CIWS before 
the outlier was removed were r = .53 and r = .54, respectively. Correlations between 
quality ratings and CWS and CIWS were r = .43 and r = .44, respectively. 
*p < .05. **p < .01. 
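
The eta-squared values reported for the univariate tests can be cross-checked against the F statistics. The sketch below assumes the standard identity for a single effect, η² = (F × df_effect) / (F × df_effect + df_error); the article does not state how its effect sizes were computed, so this is offered as a consistency check and not as a description of the authors' procedure. The F values and reported effect sizes are taken from the text; the function and variable names are hypothetical.

```python
def eta_squared_from_f(f_value: float, df_effect: int, df_error: int) -> float:
    """Effect size implied by an F statistic for a single effect (assumed identity)."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# (F statistic, eta squared as reported in the text); all tests have df = (1, 42)
reported = {
    "Functional elements": (151.83, 0.78),
    "Quality ratings": (139.59, 0.77),
    "CWS": (37.52, 0.47),
    "CIWS": (20.18, 0.32),
    "Number of words written": (59.21, 0.59),
}

computed = {
    name: eta_squared_from_f(f_value, df_effect=1, df_error=42)
    for name, (f_value, _) in reported.items()
}

for name, (_, reported_eta) in reported.items():
    print(f"{name}: computed {computed[name]:.3f}, reported {reported_eta:.2f}")
```

Under this assumption, each computed value agrees with the reported value to two decimal places, including the larger effect for number of words written than for CWS or CIWS.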

The number of students within each group was too small 
to allow for statistical testing of group differences in growth; 
however, inspection of the obtained group differences in 
growth reveals patterns that are worthy of mention. Figures 1 
and 2 display changes over time by group for the number of 
CWS and CIWS in the entire essay. Examination of these fig- 
ures reveals that although, as might be expected, the levels of 
performance for the LD and LA groups are below that of the 
AA and HA groups, rates of growth are fairly similar for stu- 
dents in all four groups (see Note 2). 

Figures 3 and 4 display changes over time by group on 
the number of CWS and CIWS written in the first 50 words 
of the essay. These figures reveal that the HA, AA, and LA 
students in our study showed little change from pretest to 
posttest (approximately 1.5 CWS and 7 CIWS on average 
across the three groups) when the length of the text was lim- 
ited to 50 words. In comparison, students with LD showed 
more substantial changes (approximately 9 CWS and 14 CIWS) 
when length was limited to 50 words. These results indicate 
that for the students who were at the lowest end of the writing 
performance continuum (i.e., students with LD), a fairly 
short sample of writing revealed growth over time; however, 
for students at the higher end of the continuum, a longer sample 
was necessary. 

FIGURE 1. Number of correct word sequences on pretests and posttests for learning disabled (LD), 
low-achieving (LA), average-achieving (AA), and high-achieving (HA) students. 

Discussion 

Results of this study provide support for the use of CBM scor- 
ing procedures in written expression. Both CWS and CIWS 
were strongly correlated with the criterion measures of the 
number of functional elements and quality ratings of the es- 
says. In addition, both measures were sensitive to change in 
student performance over time. 

This study contributes to our current body of knowledge 
about CBM in several ways. First, it provides confirmation that 
simple scoring procedures such as CWS and CIWS can be 
used as valid indicators of student performance in written ex- 
pression. Second, it extends previous work to reveal that sim- 
ple measures such as CWS and CIWS are valid not only for 
narrative and descriptive writing but also for a type of writing 
often required in secondary schools, expository essay writing. 
Finally, it adds to our confidence in the use of CWS and CIWS 
scores as general indicators of writing proficiency because the 
criterion variables in this study — functional elements and 
quality — focused on the content, coherence, and completeness 
of the writing, while controlling for basic elements of writing 
such as spelling, capitalization, and punctuation. 

The results of this study are important both practically 
and scientifically. Practically speaking, training teachers to score 
CWS and CIWS is much simpler and less time-consuming 
than training them to score functional essay elements or qual- 
ity ratings. In addition, it is much more likely that teachers will 
use simple scoring procedures such as CWS and CIWS on a 
repeated basis for monitoring growth over time than more time- 
consuming measures, such as functional elements or quality 
ratings. Scientifically, the results of this study indicate that 
researchers may also choose to use a simpler measure of stu- 
dent performance when evaluating the effect of their interven- 
tions in written expression. 

Although the results of this study are encouraging and 
contribute to our knowledge base, they also raise some impor- 
tant questions. First, based on previous research, we hypoth- 
esized that correlations between CWS and CIWS and the 
criterion variables would be in the moderate to moderately 
strong range. The magnitude of the correlations found in this 
study is large compared to those found in previous studies of 
CBM for middle school students (e.g., Espin et al., 2000). We 
speculate that these differences are due to variations in the 
time allowed for students to write. It may be necessary to give 
students more time to write, to obtain a more valid and reliable 
sample of their performance. Indeed, the analysis conducted 
with 50 words supports this conclusion. Correlations 
between the predictor and criterion measures dropped when 
CWS and CIWS were counted for a 50-word sample. These 
results indicated that although length of text alone did not account 
for the relationship between the predictor and criterion 
variables, it was an important factor. The need for a longer 
sample of text is also supported by the results of Parker et al. 
(1991b), who found that 6-min samples of writing did not produce 
indicators that were stable measures of growth over time. 

FIGURE 2. Number of correct minus incorrect word sequences on pretests and posttests for learning 
disabled (LD), low-achieving (LA), average-achieving (AA), and high-achieving (HA) students. 

The question that arises with respect to length of text is this: 
How long is long enough? Though preliminary, our results in- 
dicate that the amount of writing time needed may vary with 
students’ level of writing proficiency. Examination of growth 
patterns by group indicated that for students diagnosed with 
LD, fairly short writing samples were sufficient for reflecting 
growth over time, but for students who were better writers, 
longer writing samples were necessary. One possible reason 
for these differences is that CWS and CIWS reflect both length 
and errors. For writers who make fewer errors, the measures 
reflect mostly length; for writers who make more errors, the 
measures reflect both length and errors. Thus, even when length 
is controlled, CWS and CIWS predict writing quality for poor 
writers but not for good writers. 
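
The way CWS and CIWS combine length and errors can be illustrated with a minimal sketch. It assumes each word has already been hand-scored as acceptable or not (the `word_ok` flags are hypothetical), and it simplifies scoring by counting only adjacent word pairs, ignoring sentence boundaries and punctuation. The function name and the sample data are illustrative assumptions, not the scoring procedure used in this study.

```python
def word_sequence_scores(word_ok):
    """Return (CWS, CIWS) for a list of per-word correctness flags.

    Simplifying assumption: a word sequence is any pair of adjacent
    words; it is "correct" only when both words are flagged acceptable.
    """
    pairs = list(zip(word_ok, word_ok[1:]))
    cws = sum(1 for first, second in pairs if first and second)
    incorrect = len(pairs) - cws
    return cws, cws - incorrect


# Hypothetical 10-word samples: same length, different error counts.
accurate = [True] * 10
error_prone = [True, True, False, True, True, True, False, True, True, True]

print(word_sequence_scores(accurate))     # (9, 9)
print(word_sequence_scores(error_prone))  # (5, 1)

# Controlling length: score only the first 50 words of a longer,
# error-free hypothetical sample.
long_accurate = [True] * 120
print(word_sequence_scores(long_accurate[:50]))  # (49, 49)
```

In this toy case, an error-free writer scores 49 on both measures whenever length is fixed at 50 words, so the scores cannot separate one such writer from another. That is consistent with the interpretation offered above, although it is only an illustration under the stated simplifications.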

Practically speaking, the length of the sample that is 
needed to reveal changes over time must be balanced with the 
efficiency of the measurement system. The amount of writing 
time given in this study is probably too long in CBM terms. 
If teachers are to collect samples of student work on a weekly 
basis, and score and graph the data, a 35-min time frame is 
too long. In future research studies, time frames such as 3, 5, 
10, and 15 minutes should be compared for students at vari- 
ous levels of writing proficiency. Our speculations about the 
effects of length of text are preliminary and conjectural and 
must be substantiated in future research in which larger sam- 
ple sizes are employed. 

A second question raised by this study relates to the use 
of the number of words written as a CBM indicator. In the cur- 
rent study, unlike previous CBM studies, the number of words 
written correlated with the criterion measures at a moderately 
strong to strong level. Further, eta-squared values indicated 
that number of words written was more sensitive to change in 
performance over time 
than was CWS or CIWS. The use of the number of words writ- 
ten score needs to be examined more closely. In none of the 
previous CBM research studies conducted at the secondary 
level has the number of words written been found to be a valid 
indicator of students’ general writing performance. We 
speculate again that this finding may be related to the length of the 
sample collected in this study. This finding is worth pursuing 
further because, although CWS and CIWS are easier to score 
than quality ratings and functional elements, the number of 
words written is even easier to score than CWS or 
CIWS. 

FIGURE 3. Number of correct word sequences on first 50 words of pretests and posttests for learning 
disabled (LD), low-achieving (LA), average-achieving (AA), and high-achieving (HA) students. 

Our research was limited by the small sample size, 
which may have affected the magnitude of our correlations. 
A future study with a larger sample of students with diverse 
writing abilities would allow for a more systematic examina- 
tion of some of the questions raised in this study. In addition, 
our study was limited by the use of a pretest-posttest design 
for examining change in performance. Because the order in 
which the measures were administered was not counterbal- 
anced, the observed changes may have been influenced by 
topic. This concern is somewhat diminished by the fact that 
for each student, three pretests and three posttests were ran- 
domly selected from a pool of six or more potential essays; 
thus, change scores were not calculated on the same set of 
pretests and posttests for each student. Nonetheless, topic may 
still have exerted an influence on the observed change scores. 

An additional limitation to our study is the fact that we 
examined change based on a small number of data points col- 
lected at the beginning and end of an intervention. CBM mea- 
sures, however, are designed to be given on a repeated and 
frequent basis (i.e., once a week) so that teachers can make 
decisions regarding the effects of their interventions through- 
out the school year. A stronger test of the validity and relia- 
bility of the CBM writing measures for growth monitoring 
would be to examine the technical adequacy of the growth tra- 
jectories created by the measures. (See Shin, Deno, & Espin, 
2000, for an example of this type of study using reading data 
collected at the elementary school level.) 

In conclusion, the results of this study support the use 
of CWS and CIWS as indicators of students’ general writing 
performance and introduce the possibility of using the num- 
ber of words written as a CBM indicator. The results support 
previous research on the use of CWS and CIWS as general 
indicators of performance and contribute to our knowledge 
base about the use of CBM procedures for students at the mid- 
dle school level. 
NOTES 

1. This data point fell within the normal range of scores for analy- 
ses involving all words, and thus was not removed for those 
analyses. 

2. The figures demonstrate that the AA group did somewhat better 
on the writing performance measures than the HA group. These 
differences were evident in the levels of performance on the func- 
tional elements and quality ratings as well. 